{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在自定义搜索空间中，我们提到HyperTS针对不同的模式内置了丰富的建模算法，如果您需要增加对其他算法的支持，可以通过如下步骤进行自定义建模算法:\n",
    "- 将自定义算法封装为```HyperEstimator```的子类;\n",
    "- 将封装后的算法增加到特定任务的SearchSpace中，并定义其搜索参数;\n",
    "- 在```make_experiment```中使用自定义的search_space。\n",
    "\n",
    "假如现在我们想在DLForecastSearchSpace中增加自己的神经网络模型Transformer，示例如下:\n",
    "\n",
    "1. 构建自定义模型\n",
    "2. 构建自定义算法\n",
    "3. 构建算法估计器\n",
    "4. 重构搜索空间\n",
    "5. 使用新搜索空间执行实验\n",
    "\n",
    "--------------------------"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1. 构建自定义模型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们基于tensorflow构建一个**Transformer Encoder**, 该结构参看自[Keras官方教程](https://keras.io/examples/timeseries/timeseries_classification_transformer/)。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow.keras import layers\n",
    "\n",
    "def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0.):\n",
    "    x = layers.MultiHeadAttention(\n",
    "        key_dim=head_size, num_heads=num_heads, dropout=dropout\n",
    "    )(inputs, inputs)\n",
    "    x = layers.Dropout(dropout)(x)\n",
    "    x = layers.LayerNormalization(epsilon=1e-6)(x)\n",
    "    res = x + inputs\n",
    "\n",
    "    x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation=\"relu\")(res)\n",
    "    x = layers.Dropout(dropout)(x)\n",
    "    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)\n",
    "    x = layers.LayerNormalization(epsilon=1e-6)(x)\n",
    "    return x + res"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2. 构建自定义算法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了方便起见，我们可以直接继承HyperTS中已存在的算法，这样除了必要的init部分外，我们只完成```_build_estimator```中的backbone部分即可, 即:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import tensorflow.keras.backend as K\n",
    "from hyperts.framework.dl import layers\n",
    "from hyperts.framework.dl.models import HybridRNN\n",
    "\n",
    "class Transformer(HybridRNN):\n",
    "\n",
    "    def __init__(self, \n",
    "                 task, \n",
    "                 timestamp=None, \n",
    "                 window=7, \n",
    "                 horizon=1, \n",
    "                 forecast_length=1, \n",
    "                 head_size=10,\n",
    "                 num_heads=6,\n",
    "                 ff_dim=10,\n",
    "                 transformer_blocks=1,\n",
    "                 drop_rate=0.,\n",
    "                 metrics='auto',\n",
    "                 monitor_metric='val_loss',\n",
    "                 optimizer='auto',\n",
    "                 learning_rate=0.001,\n",
    "                 loss='auto',\n",
    "                 out_activation='linear',\n",
    "                 reducelr_patience=5, \n",
    "                 earlystop_patience=10, \n",
    "                 embedding_output_dim=4,\n",
    "                 **kwargs):\n",
    "        super(Transformer, self).__init__(task=task, \n",
    "                                          timestamp=timestamp, \n",
    "                                          window=window, \n",
    "                                          horizon=horizon, \n",
    "                                          forecast_length=forecast_length,\n",
    "                                          drop_rate=drop_rate,\n",
    "                                          metrics=metrics, \n",
    "                                          monitor_metric=monitor_metric, \n",
    "                                          optimizer=optimizer,\n",
    "                                          learning_rate=learning_rate, \n",
    "                                          loss=loss, \n",
    "                                          out_activation=out_activation, \n",
    "                                          reducelr_patience=reducelr_patience, \n",
    "                                          earlystop_patience=earlystop_patience,\n",
    "                                          embedding_output_dim=embedding_output_dim, \n",
    "                                          **kwargs)\n",
    "        self.head_size = head_size\n",
    "        self.num_heads = num_heads\n",
    "        self.ff_dim = ff_dim\n",
    "        self.transformer_blocks = transformer_blocks\n",
    "\n",
    "    \n",
    "    def _build_estimator(self, **kwargs):\n",
    "        K.clear_session()\n",
    "        continuous_inputs, categorical_inputs = layers.build_input_head(self.window, self.continuous_columns, self.categorical_columns)\n",
    "        denses = layers.build_denses(self.continuous_columns, continuous_inputs)\n",
    "        embeddings = layers.build_embeddings(self.categorical_columns, categorical_inputs)\n",
    "        if embeddings is not None:\n",
    "            x = layers.Concatenate(axis=-1, name='concat_embeddings_dense_inputs')([denses, embeddings])\n",
    "        else:\n",
    "            x = denses  \n",
    "\n",
    "        ############################################ backbone ############################################\n",
    "        for _ in range(self.transformer_blocks):\n",
    "            x = transformer_encoder(x, self.head_size, self.num_heads, self.ff_dim, self.drop_rate)\n",
    "        x = layers.GlobalAveragePooling1D(data_format=\"channels_first\")(x)\n",
    "        ##################################################################################################\n",
    "\n",
    "        outputs = layers.build_output_tail(x, self.task, nb_outputs=self.meta.classes_, nb_steps=self.forecast_length)\n",
    "        outputs = layers.Activation(self.out_activation, name=f'output_activation_{self.out_activation}')(outputs)\n",
    "\n",
    "        all_inputs = list(continuous_inputs.values()) + list(categorical_inputs.values())\n",
    "        model = tf.keras.models.Model(inputs=all_inputs, outputs=[outputs], name=f'Transformer')\n",
    "        model.summary()\n",
    "        return model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3. 构建算法估计器"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "估计器将是连接算法模型与搜索空间的桥梁与纽带，它可以规定哪些超参数将能够被搜索优化。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from hyperts.utils import consts\n",
    "from hyperts.framework.wrappers.dl_wrappers import HybridRNNWrapper\n",
    "from hyperts.framework.estimators import HyperEstimator\n",
    "\n",
    "class TransformerWrapper(HybridRNNWrapper):\n",
    "\n",
    "    def __init__(self, fit_kwargs, **kwargs):\n",
    "        super(TransformerWrapper, self).__init__(fit_kwargs, **kwargs)\n",
    "        self.update_fit_kwargs()\n",
    "        self.model = Transformer(**self.init_kwargs)\n",
    "\n",
    "\n",
    "class TransfomerEstimator(HyperEstimator):\n",
    "\n",
    "    def __init__(self, fit_kwargs=None, timestamp=None, task='univariate-forecast', window=7,\n",
    "                 head_size=10, num_heads=6, ff_dim=10, transformer_blocks=1, drop_rate=0.,\n",
    "                 metrics='auto', optimizer='auto', out_activation='linear',\n",
    "                 learning_rate=0.001, batch_size=None, epochs=1, verbose=1,\n",
    "                 space=None, name=None, **kwargs):\n",
    "\n",
    "        if task in consts.TASK_LIST_FORECAST and timestamp is None:\n",
    "            raise ValueError('Timestamp need to be given for forecast task.')\n",
    "        else:\n",
    "            kwargs['timestamp'] = timestamp\n",
    "        if task is not None:\n",
    "            kwargs['task'] = task\n",
    "        if window is not None and window != 7:\n",
    "            kwargs['window'] = window\n",
    "        if head_size is not None and head_size != 10:\n",
    "            kwargs['head_size'] = head_size\n",
    "        if num_heads is not None and num_heads != 6:\n",
    "            kwargs['num_heads'] = num_heads\n",
    "        if ff_dim is not None and ff_dim != 10:\n",
    "            kwargs['ff_dim'] = ff_dim\n",
    "        if transformer_blocks is not None and transformer_blocks != 1:\n",
    "            kwargs['transformer_blocks'] = transformer_blocks\n",
    "        if drop_rate is not None and drop_rate != 0.:\n",
    "            kwargs['drop_rate'] = drop_rate\n",
    "        if metrics is not None and metrics != 'auto':\n",
    "            kwargs['metrics'] = metrics\n",
    "        if optimizer is not None and optimizer != 'auto':\n",
    "            kwargs['optimizer'] = optimizer\n",
    "        if out_activation is not None and out_activation != 'linear':\n",
    "            kwargs['out_activation'] = out_activation\n",
    "        if learning_rate is not None and learning_rate != 0.001:\n",
    "            kwargs['learning_rate'] = learning_rate \n",
    "\n",
    "        if batch_size is not None:\n",
    "                kwargs['batch_size'] = batch_size\n",
    "        if epochs is not None and epochs != 1:\n",
    "            kwargs['epochs'] = epochs\n",
    "        if verbose is not None and verbose != 1:\n",
    "            kwargs['verbose'] = verbose\n",
    "\n",
    "        HyperEstimator.__init__(self, fit_kwargs, space, name, **kwargs)\n",
    "\n",
    "    def _build_estimator(self, task, fit_kwargs, kwargs):\n",
    "        if task in consts.TASK_LIST_FORECAST + consts.TASK_LIST_CLASSIFICATION:\n",
    "            transformer = TransformerWrapper(fit_kwargs, **kwargs)\n",
    "        else:\n",
    "            raise ValueError('Check whether the task type meets specifications.')\n",
    "        return transformer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4. 重构搜索空间"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "将估计器添加到搜索空间中就大功告成啦！在这里，我们可以设置自定义算法中一些超参数的搜索空间，这一步将是发挥算法性能的关键:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from hypernets.core.search_space import Choice, Real\n",
    "from hyperts.framework.search_space.macro_search_space import DLForecastSearchSpace\n",
    "\n",
    "\n",
    "class DLForecastSearchSpacePlusTransformer(DLForecastSearchSpace):\n",
    "\n",
    "    def __init__(self, task=None, timestamp=None, metrics=None, window=None, enable_transformer=True, **kwargs):\n",
    "        super().__init__(task=task, timestamp=timestamp, metrics=metrics, window=window, **kwargs)\n",
    "        self.enable_transformer = enable_transformer\n",
    "\n",
    "    @property\n",
    "    def default_transformer_init_kwargs(self):\n",
    "        return {\n",
    "            'timestamp': self.timestamp,\n",
    "            'task': self.task,\n",
    "            'metrics': self.metrics,\n",
    "\n",
    "            'head_size': Choice([8, 16, 24, 32]),\n",
    "            'num_heads': Choice([2, 4, 6]),\n",
    "            'ff_dim': Choice([8, 16, 24, 32]),\n",
    "            'drop_rate': Real(0., 0.5, 0.1),\n",
    "            'transformer_blocks': Choice([1, 2, 3]),            \n",
    "            'window': self.window if self.window is not None else Choice([12, 24, 48]),\n",
    "\n",
    "            'y_log': Choice(['logx', 'log-none']),\n",
    "            'y_scale': Choice(['min_max', 'max_abs'])\n",
    "        }\n",
    "\n",
    "    @property\n",
    "    def default_transformer_fit_kwargs(self):\n",
    "        return {\n",
    "            'epochs': 60,\n",
    "            'batch_size': None,\n",
    "            'verbose': 1,\n",
    "        }\n",
    "\n",
    "    @property\n",
    "    def estimators(self):\n",
    "        r = super().estimators\n",
    "        if self.enable_transformer:\n",
    "            r['transformer'] = (TransfomerEstimator, self.default_transformer_init_kwargs, self.default_transformer_fit_kwargs)\n",
    "        return r"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 5. 使用新搜索空间执行实验"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里将和前面介绍的自定义搜索空间的操作一致。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "方式一："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from hyperts import make_experiment\n",
    "from hyperts.datasets import load_network_traffic\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "df = load_network_traffic(univariate=True)\n",
    "train_data, test_data = train_test_split(df, test_size=168, shuffle=False)\n",
    "\n",
    "custom_search_space = DLForecastSearchSpacePlusTransformer(task='univariate-forecast', \n",
    "                                                           timestamp='TimeStamp',\n",
    "                                                           covariables=['HourSin', 'WeekCos', 'CBWD'],\n",
    "                                                           metrics=['mape'])\n",
    "\n",
    "experiment = make_experiment(train_data, \n",
    "                             task='univariate-forecast',\n",
    "                             mode='dl',\n",
    "                             timestamp='TimeStamp',\n",
    "                             covariables=['HourSin', 'WeekCos', 'CBWD'],\n",
    "                             search_space=custom_search_space,\n",
    "                             reward_metric='mape',\n",
    "                             ...)\n",
    "\n",
    "model = experiment.run() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "方式二（简化版）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from hyperts import make_experiment\n",
    "from hyperts.datasets import load_network_traffic\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "df = load_network_traffic(univariate=True)\n",
    "train_data, test_data = train_test_split(df, test_size=168, shuffle=False)\n",
    "\n",
    "custom_search_space = DLForecastSearchSpacePlusTransformer()\n",
    "\n",
    "experiment = make_experiment(train_data, \n",
    "                             task='univariate-forecast',\n",
    "                             mode='dl',\n",
    "                             timestamp='TimeStamp',\n",
    "                             covariables=['HourSin', 'WeekCos', 'CBWD'],\n",
    "                             search_space=custom_search_space,\n",
    "                             reward_metric='mape',\n",
    "                             ...)\n",
    "\n",
    "model = experiment.run() "
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "29dd11bf16d40fe19950dcc2f06dd773fb6bc4491ac296fb211bfed7a4a532da"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}